home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Tech Arsenal 1
/
Tech Arsenal (Arsenal Computer).ISO
/
tek-02
/
pasnews.zip
/
PNL003.TXT
< prev
next >
Wrap
Text File
|
1990-06-26
|
112KB
|
2,195 lines
//////// // // //
// // /// // //
// // //// // //
//////// // // // //
// // //// //
// // /// //
// // // ///////
Pascal NewsLetter
Issue #3
June, 1990
Editor: Pete Davis
The Programmer's Forum BBS is the home of
PNL. It can be reached in Washington, DC at
(202)966-3647. Information is available
through the following locations:
FidoNet: Pete Davis@1:109/138
GEnie: P.DAVIS5
BitNet: HJ647C@GWUVM & UE356C@GWUVM
InterNet: HJ647C@GWUVM.GWU.EDU
UE356C@GWUVM.GWU.EDU
and Pete.Davis@f138.n109.z1.fidonet.org
2
Table of Contents
Introduction ..................................... Page 3 (P.D.)
Letter from Charles Falconer ..................... Page 4 (C.F.)
A Diatribe on Structured Programming and Pascal .. Page 8 (C.F.)
Pascal I/O Semantics ............................. Page 18 (C.F.)
All Sorts of Sorts ............................... Page 24 (P.D.)
For Beginners .................................... Page 29 (B.G.)
The Object of Objects ............................ Page 35 (J.R.)
Doing More With Strings .......................... Page 39 (G.H.)
Bits & Pieces .................................... Page 42 (****)
Conclusion ....................................... Page 44 (P.D.)
P.D. -> Pete Davis (Editor, Writer)
C.F. -> Charles Falconer (Contributing Writer)
J.R. -> Jim Reid (Contributing Writer)
B.G. -> Bob Gowans (Staff Writer)
G.H. -> Gerhard Hoogterp (Contributing Writer)
Bits & Pieces is contibuted by various readers and contributing
writers. Credit will be given in that Bits & Pieces area.
Turbo Pascal is a registered trademark of Borland International
3
Introduction
Well, I'm starting on issue #3 of the PNL for an unexpected
reason. I, just minutes ago, received what must have been the toughest
criticisms I have received yet. Anyone who frequents the Pascal echo
in FidoNet knows who Charles Falconer is. I received a nice-sized
packet of files from Charles regarding the PNL. It was loaded with
constructive criticism. Some of it is already noticable. Charles
pointed out that most people will go through the PNL on the screen and
would prefer single-spaced text. Since I print it out, I was making it
look good for printouts. I think Charles is right, though, so I am
taking that advice. If there are enough people who want the double-
spaced version, I will either release them together, or I will provide
the double-spaced version to those who request it.
Anyway, on with what Charles sent to me. His article was full of
advice and comments. I really found it helpful and as Charles wrote:
'If I agreed with everything I read there would be no point in reading
it!' I agree with that, but not everything you said.I did agree with
most of what you said, though. Hopefully the users will approve of the
changes. Anyway, along with the helpful comments, Charles included two
articles that he had written. How could I possibly refuse to include
them? Charles, thanks a lot. I really appreciate you advice and
encouragement. I also appreciate your articles. They'll help beef-up
the third issue a bit.
Because Charles included so many good tips and because some of
them are regarding articles I wrote in the first issue, I have
included a copy of his letter so others can see what he has to say
about Generic Structures and Optimization, two articles that were
written in the first issue.
I meant to get to my sorting article in the last issue. Obviously
that didn't happen, so I'm putting it in this issue. I have been
blessed with 4 articles so far. I can't tell you how much these
articles help out. I think it is going to be necessary to make some
small rules for submissions, though. The rules are pretty simple:
1> Single-Spaced
2> No justification
3> No page numbers
4> carriage returns only at end of paragraphs and where necessary
(i.e. creating diagrams)
I'm not going to refuse an article if it doesn't follow these
specifications, it just makes things easier for me when including it
in the newsletter.
Well, I hope you enjoy what follows.
4
Letter from Charles Falconer
I am attaching a couple of contributions - feel free to edit them
within reason. These have been around in various forms for a while.
Also feel free to grab anything I put on the Pascal Echo, if you wish,
(unless I have done something foolish). It is all public domain. If
you publish anything of mine include my Fidonet address please.
On formatting of PNL: (all opinion)
These are only suggestions, after all it is your project, and I am
thankful for it. As an aside, the results are smaller and take less
transmission time and disk space.
I find the double spacing to be virtually unreadable on the CRT. This
sort of thing is virtually never printed, and when it does the user
doesn't want to use a lot of paper. Who knows, he may even have to
feed individual sheets.
I just found these on Hippocampus, and have not really read anything
yet.
I suggest a formatting of no more than 72, preferably 66, chars per
line, with no indentation, single spaced, with a maximum of 60
lines/page. If lines are not right justified (minor point) the user
can easily reformat to taste. Pages should be separated with
form-feeds, not white space, as the white space makes it very hard to
follow anything on the CRT. For the same reason pagenumbers and the
ilk should be at page top.
I attach PNL001 reformatted to roughly these specifications, for your
consideration. I passed it through two filters, changed all
<eoln><eoln> into <eoln>, and did some minor hand editing afterwards.
The original contained some naked <cr>s, which didn't even overprint,
but were removed by my stripovr filter (the filters are available in
DETAB12.LZH on 141/205)
If you feel indentation for printing is critical, I will be happy to
write a small filter to add it in. It can also expand <ff> to a
constant 66 (or other) lines per page if needed. (Europeans have 12
inch paper standard). Form-feeds also serve to trigger page printing
on Laser Printers.
I also have RNF, in Pascal, available with source, for generalized
formatting. It can do quite a pleasant job. RNF13AT.LZH (110k), also
on 141/205. RNF13SOR.LZH has ISO standard source, 13AT is for Turbo
and includes the documentation and Turbo source. The docs are a
deliberate mess, and get cleaned up by running RNF on them.
With your permission only, I will mount the reformatted versions on
Hippocampus to replace yours. No sweat either way. {Editors Note:
Feel free to do it.}
5
Net mail to 141/209.1 will reach me, addressed to Charles Falconer.
1990/5/18 Chuck.
------
ps: I read over pnl001 and have a few minor comments:
1. Optimizing for size often DOES improve speed {ed. note: Yes, in
most cases, but not all! }. The data structure and algorithm is
usually more important than the code. Clarity is at the top of
my wants 99% of the time. Do nothing without measuring the
program first - otherwise the rewards are not going to be
noticeable, but the (clarity) penalties are. Smaller code
means more other resource available. I have a system where
I can find sharp performance improvements for every 1k less code
space used - sometimes only 128 bytes is important. In this
case the difference is availability of swap space in memory, and
the increment suddenly means something doesn't get swapped.
2. Generic procs - If the body of the system deals with the two
pointer linkage item, rather than the data body, much more
can be made generic. Only a few rare i/o procedures will
need to know about the actual data item. Thus 'push' would
push such a linkage record, pop would return one, and could
be a function. Nothing need even know they are headers after
the lowest level.
3. Fetish of mine - Pascal is ALWAYS capitalized. A proper name.
4. Optimizing - I virtually never turn off run time checks. The
exception is in the heart of loops, once well debugged. Such
loops should have loop invariants stated and checked. A whole
program is never debugged.
5. I have a different view than most about formatting, some of
which is in the attached files.
6. Generally excellent. If I agreed with everything I read there
would be no point in reading it! You seem to have pretty sound
teachers and a good mind.
pps:
Here is a useful tidbit for linked lists. This only assumes the list
pointers contain a 'next' field. Takes a single pass. Just pass it
the root pointer.
PROCEDURE reverse(VAR dirlist : adirptr);
(* Do a list walk and reverse the list order *)
VAR
prev, pprev : adirptr;
6
BEGIN (* reverse *)
prev := NIL;
WHILE dirlist <> NIL DO BEGIN
pprev := prev; prev := dirlist;
dirlist := dirlist^.next; prev^.next := pprev; END;
dirlist := prev;
END; (* reverse *)
This allows you to create a list with simple code, i.e.
new(p); p^.next := root; root := p;
(* then set any data fields *)
and later access in the order of input. One example of use is in
scanning parameter lists when compiling a procedure header. Then the
final list is used to check procedure calls correctly as they come in
from the source.
and here is the performance of MRGSORT on my system, sorting 12 char.
random alphabetic strings (packed into pksize = 4 integers, 3 per
integer). The whole of this was published on Pascal Echo, and is
available as MRGSORT.LZH on 1:141/205. The MRGSORT unit operates on
singly linked lists, only requiring that 'next' be the first field.
(* On my 8mhz V20 XT system, executes as follows: *)
(* items creation time sorting time *)
(* ----- ------------- ------------ *)
(* 10 0.013 Sec. 0.010 Sec. *)
(* 100 0.117 Sec. 0.164 Sec. *)
(* 500 0.582 Sec. 1.050 Sec. *)
(* 2500 2.903 Sec. 6.407 Sec. *)
(* 12500 14.502 Sec. 38.028 Sec. *)
(* (FULL) 33874 38.028 Sec. 113.692 Sec. *)
(* which shows the n*log(n) behaviour of the algorithm. *)
and using the following comparison procedure:
{$f+} (* passed functions MUST be far *)
FUNCTION greater(thing, than : pointer) : boolean;
(* This is the time bind - make assy language. This *)
(* will later be passed in as a param to mrgsort *)
LABEL 9, 10;
VAR
k : pkindex;
(* These gyrations bypass type checking, and describe *)
(* the actual pointer type that mrgsort will call with *)
{} a : alfaptr ABSOLUTE thing;
{} b : alfaptr ABSOLUTE than;
7
{$r-,s-}
BEGIN (* greater *)
greater := true;
FOR k := pksize DOWNTO 1 DO (* Check most sig. first *)
IF a^.s[k] > b^.s[k] THEN GOTO 10 (* decided *)
ELSE IF a^.s[k] < b^.s[k] THEN GOTO 9; (* decided *)
9: greater := false;
10: END; (* greater *)
{$r+,s+,f-} (* put the options back *)
This being the major time bind, it is the only place that range checks
are turned off. As an aside, I would have used your 'generic' linkage
method, except that it would involve an extra dereference right here
in the sensitive heart of the algorithm. Changing to such is easy.
The {} in the lh margins mark non-standard usage.
{Editors Note: 8 lines removed: Not of interest to readers.}
8
A diatribe on Structured Programming and Pascal
by C.B. Falconer, 680 Hartford Tpk, Hamden, Ct 06517. 281-1438
Structured programming is NOT about GOTO less code, or languages,
but about the design methodology. I use Pascal, (ISO standard
version, not Turbo, but the remarks generally apply). You design
a program from the top down, i.e.:
Step 1. Write down what it is going to do, and specify the data
used. Keep the data to an absolute minimum, because the less you
have to remember the fewer mistakes there will be:
(I will write a simple line copying program, assuming two extensions
to standard Pascal, i.e. read(..) can read a string, and length(..)
can return the length of that string. Most systems have these
extensions, but if not the procedures are easily created in STANDARD
Pascal)
(For Turbo users, some additional changes have to be made. These are
shown in a final addendum to this file. The problems are:
1. Turbo REQUIRES the assign operation
2. Turbo fails to automatically close files
3. Turbo fails to truncate string writes to the field value.
4. Turbo uses the predefined "string" type to enable the
read(string).
all are fairly easily handled, as can be seen in the modifications)
-------
PROGRAM something(infile, outfile, output);
(* using output for error messages, processing infile to outfile *)
(* Define all data types that will be used GLOBALLY or as parameters
*)
CONSTANT
maxbuff = 80;
TYPE
buff = ARRAY [1..maxbuff] OF char;
VAR
infile,
outfile : text;
buffer : buff;
BEGIN (* something *)
initialize;
WHILE NOT eof(infile) DO BEGIN
getdata(infile, buffer);
putdata(outfile, buffer); END;
9
cleanup;
END. (* something *)
-------
This models what you want to do, with no detail. Once you have
written the other procedures (and whatever they require) you are
done. ALL the data that is really needed is already specified.
You can add dummy procedures and check the result by compiling at
this point. Note that, as a matter of style, all data is passed
via parameters. This makes future changes much easier, and avoids
looking around for "what is this thing".
Step 2: Put in dummy procedures to resolve the program. The result
can be compiled.
-------
PROGRAM something(infile, outfile, output);
(* using output for error messages, processing infile to outfile *)
(* Define all data types that will be used GLOBALLY or as parameters
*)
CONSTANT
maxbuff = 80;
TYPE
buff = ARRAY [1..maxbuff] OF char;
VAR
infile,
outfile : text;
buffer : buff;
(* 1---------1 *)
PROCEDURE initialize;
BEGIN (* initialize *)
END; (* initialize *)
(* 1---------1 *)
PROCEDURE getdata(VAR infile : text; VAR buffer : buff);
BEGIN (* getdata *)
END; (* getdata *)
(* 1---------1 *)
PROCEDURE putdata(VAR outfile : text, buffer : buff);
10
BEGIN (* putdata *)
END; (* putdata *)
(* 1---------1 *)
PROCEDURE cleanup;
BEGIN (* cleanup *)
END; (* cleanup *)
(* 1---------1 *)
BEGIN (* something *)
initialize;
WHILE NOT eof(infile) DO BEGIN
getdata(infile, buffer);
putdata(outfile, buffer); END;
cleanup;
END. (* something *)
-------
Notice that each procedure is indented to show its control level,
and the BEGIN/ENDS are commented. This will be useful when fully
expanded, and if inserted at this stage as a habit will always be
present to help.
Notice also that each parameter is VAR only when the procedure must
be able to alter it. In some cases this may cause a minor slowdown,
but the advantage is the GUARANTEE that errors in one procedure do
not propagate into the rest of the program. This also allows the
parameter to be supplied by an expression (for integers, reals,
booleans, etc.), which cannot be done for a VAR parameter. Think
of non-VAR parameters as pre-initialized data areas for the procedure,
dynamically assigned.
Sometimes obscure bugs arise in Pascal because of failure to specify
the VAR when the parameter is to be modified. This seems to be
mentally easy to overlook. All file specification paramters must
always be VAR.
You can read this program (even though it does nothing yet) and
trivially satisfy yourself that it is correct. I.E. it will perform
the function, and will terminate.
FROM HERE ON you will NEVER change the global part. EVERYTHING
added will be in the 4 procedures that the global program calls. You
may want to tweak the type definition of buffer (that is why the
constant MAXBUFF exists).
Keeping everything simple, we will leave the cleanup procedure empty
(but another application may need it), and fill in the details for a
11
complete program.
Step 3: If complex, put in simple code to check that the outer
system runs the way you want it (starts, stops, etc). Then repeat
the design process on each individual procedure if needed. This
example is so simple that no repeat will be needed.
-------
EXCEPT for the "readln" in GETDATA, and the "length" in PUTDATA,
this will compile and run on ANY Pascal system meeting the ISO or
ANSI standards. Correction will involve the definition of buff,
and writing two custom procedures, at most.
PROGRAM something(infile, outfile, output);
(* using output for error messages, processing infile to outfile *)
(* Define all data types that will be used GLOBALLY or as parameters
*)
CONSTANT
maxbuff = 80;
TYPE
buff = ARRAY [1..maxbuff] OF char;
VAR
infile,
outfile : text;
buffer : buff;
(* 1---------1 *)
PROCEDURE initialize;
(* all that is needed to open the input and output files in standard
*)
(* Pascal. Non-Standard systems (e.g. Turbo) may need more.
*)
BEGIN (* initialize *)
reset(infile);
rewrite(outfile);
END; (* initialize *)
(* 1---------1 *)
PROCEDURE getdata(VAR infile : text; VAR buffer : buff);
(* We might want this to read one word, as an example of a different
*)
(* program using the same basic model, or to ignore non-numeric
input *)
(* For numeric input, we could send an error message if none found.
*)
12
BEGIN (* getdata *)
readln(infile, buffer);
END; (* getdata *)
(* 1---------1 *)
PROCEDURE putdata(VAR outfile : text, buffer : buff);
(* And this might want to upshift everything, or insert tabs, etc.
*)
BEGIN (* putdata *)
writeln(outfile, buffer : length(buffer));
END; (* putdata *)
(* 1---------1 *)
PROCEDURE cleanup;
BEGIN (* cleanup *)
END; (* cleanup *)
(* 1---------1 *)
BEGIN (* something *)
initialize;
WHILE NOT eof(infile) DO BEGIN
getdata(infile, buffer);
putdata(outfile, buffer); END;
cleanup;
END. (* something *)
-------
And the program is written. Each component is so simple, and deals
with so few items, that you can be sure it is correct. Almost all
errors will be syntax errors (omitted semi-colons, mis-spellings,
etc) which the compiler will catch for you.
If you are writing in a non-structured language, such as assembly or
Fortran, use an outline similar to the above, and translate it into
the language you are using. GOTO's can then be used, BUT THE FLOW
OF CONTROL is strictly structured, and thus clear. (But there is a
place for GOTOs, even in Pascal - that is another diatribe). Similarly
the use of data items is clear, even if you have to declare other
globals to replace the local variables, because of the end language.
Notice that, in Pascal, you are usually backing up to somewhere
between the procedure (or program) heading, and the executable code
for that block (marked by "BEGIN (* blockname *)" ) to insert the
next level. In Pascal you never need to worry about names, because
inserted procedures are local, and the only name conflicts that can
occur are with those names shown in the header (and any local
13
declarations). You can see all these immediately.
This all builds on the well known fact that human brains can normally
handle up to about 7 items simultaneously. Once you have to worry
about more things you will miss something. Thus, KEEP IT SIMPLE.
Build a complex system out of SIMPLE building blocks. At any level,
you can understand what is going on.
Pascal makes such coding very easy, but forces you to follow a
standard format (basically declare everything before using it).
The use of indentation to depict levels of control eases
understanding, i.e. to see what controls whether a statement is
executed, just follow up to the outer indentation level (this is a
convention, not a language feature). Everything looks more or less
like
IF condition THEN BEGIN
dothis;
andthat;
andsoforth; END
ELSE BEGIN
dosomethingelse
andsoon; END;
and you can tell at a glance what conditions will allow each group
to be executed.
If you really must use Basic, Fortran, Cobol, and other antiques,
try writing pseudo code in a structured form, keep it as comments,
and let it define the real code you write. When you make a change,
revise the pseudo-code (for a real change, not just a coding error),
and keep it current.
It is not hard to write 500 or 1000 line programs that work first
shot in this manner (excepting the typos, which the compiler usually
will catch for you).
Summary (and additions) -
KEEP EACH module SIMPLE.
Let the outer code define what each module must do.
Keep each module on one CRT screen or less.
Use parameters, even if not really needed. (You can change to
globals for efficiency later if you have to for efficiency).
AVOID proliferation of GLOBAL variables. Keep it SIMPLE.
------------------------------------------------------------------
(For Turbo users, some additional changes have to be made. These are
shown in a final addendum to this file. The problems are:
1. Turbo REQUIRES the assign operation
2. Turbo fails to automatically close files
3. Turbo fails to truncate string writes to the field value.
14
4. Turbo uses the predefined "string" type to enable the
read(string).
all are fairly easily handled, as can be seen in the modifications)
HERE IS THE MODIFIED PROGRAM, with changes inserted commented.
look for the "<<<" string.
-------
THIS Turbo VERSION will not run on an ANSI/ISO standard system.
PROGRAM something(infile, outfile, output);
(* using output for error messages, processing infile to outfile *)
(* Define all data types that will be used GLOBALLY or as parameters
*)
CONSTANT
maxbuff = 80;
TYPE
buff = string[maxbuff]; (* <<< CHANGED *)
(* ^ *)
(* ^-- one of the worst choices in Turbo is *)
(* that "string" is a reserved word (rather than *)
(* a predefined type. This causes all sorts of *)
(* nuisances in porting standard Pascal to Turbo.*)
VAR
infile,
outfile : text;
buffer : buff;
(* 1---------1 *)
PROCEDURE initialize;
(* all that is needed to open the input and output files in standard
*)
(* Pascal. Non-Standard systems (e.g. Turbo) may need more.
*)
(* 2---------2 *)
PROCEDURE attachfile(VAR f : text; position : integer); (* <<
ADD *)
(* attach the Pascal file to the commandline parameter *)
(* in position "position" of the run-time command *)
(* A style note: An IF THEN .. ELSE .. clause should *)
(* normally be written with the longer clause in the ELSE *)
(* part. This makes it easy to see the flow of control. A *)
(* short ELSE clause 28 pages down is easily missed, not to *)
(* mention the nuisance of deciding which IF it goes with. *)
15
BEGIN (* attachfile *)
IF position <= paramcount THEN assign(f, paramstr(position))
ELSE BEGIN
writeln('Need file #', position : 1, ' specification');
halt; END;
END; (* attachfile *)
(* 2---------2 *)
BEGIN (* initialize *)
attachfile(infile, 1); attachfile(outfile, 2); (* << ADDED *)
reset(infile);
rewrite(outfile);
END; (* initialize *)
(* 1---------1 *)
PROCEDURE getdata(VAR infile : text; VAR buffer : buff);
(* We might want this to read one word, as an example of a different
*)
(* program using the same basic model, or to ignore non-numeric
input *)
(* For numeric input, we could send an error message if none found.
*)
(* Instead of using the Turbo string type, we could have *)(* <<
NOTE *)
(* made a different buff definition, such as: *)
(* buff = RECORD *)
(* lgh : integer; *)
(* data : ARRAY[1..maxbuff] OF char; *)
(* END; *)
(* and altered this procedure to read into it, setting *)
(* the lgh value. Then the putdata would have needed *)
(* only minor modification, such as: *)
(* writeln(buffer.data : buffer.lgh); *)
(* except that Turbo fails to use the field correctly. *)
BEGIN (* getdata *)
readln(infile, buffer);
END; (* getdata *)
(* 1---------1 *)
PROCEDURE putdata(VAR outfile : text, buffer : buff);
(* And this might want to upshift everything, or insert tabs, etc.
*)
BEGIN (* putdata *)
writeln(outfile, buffer : length(buffer)); (* << NOTE *)
(* For Turbo strings we can omit the " : length(buffer)" *)
16
(* clause above. If the buff type was still really an *)
(* array of char we would need something like : *)
(* FOR i := 1 TO length(buffer) DO *)
(* write(f, buffer[i]; *)
(* to compensate for the Turbo failure to truncate. *)
END; (* putdata *)
(* 1---------1 *)
PROCEDURE cleanup;
BEGIN (* cleanup *)
close(outfile); (* << ADDED - Turbo doesnt automate
*)
END; (* cleanup *)
(* 1---------1 *)
BEGIN (* something *) (* << The outer block doesnt change
*)
initialize;
WHILE NOT eof(infile) DO BEGIN
getdata(infile, buffer);
putdata(outfile, buffer); END;
cleanup;
END. (* something *)
-------
More style notes: BEGIN and END are used to delimit compound
statements in Pascal, and not to show control flow. Modula uses
END to show the end of controlled statements, and implies the BEGIN
after IF, WHILE, etc. Thus I feel extra indentations for BEGINs
are pure confusion (and lead to programs that run off the right of
the page), and subordinate them. I also usually attach the END to
the end of the controlled code, as shown, and use a separate line
only when multiple ENDS must appear. This helps to keep code
vertically compact, so that the whole control flow can be seen at
a glance. In Pascal BEGIN END really delimit blocks only when
used at the start and finish of PROCEDURE or FUNCTION code.
Similarly, CASE labels are really LABELS, and belong at the left of
the source listing. They are not controlled statements, but
reference points to which control flows, eg:
CASE ch : char OF
'A', 'a': dothingswitha;
'B', 'b': dothingswithb;
'C': BEGIN
makeanoise;
dothingswithbigC;
END;
17
'c': dothingswithlittlec;
END; (* case *)
and, on reading, it is quite obvious that only one path will be
followed, and how to find that path.
18
PASCAL I/O semantics, File PASCALIO.DOC
C.B. Falconer 85/9/11, 87/2/12, 88/11/20, 89/12/12
This is an explanation of the action of Pascal I/O, as applied to
text files. A system meeting the ISO and ANSI standards is
assumed. This does not apply to Turbo Pascal exactly, because
Turbo omits some of the standard abilities and functions,
especially for console input (but Turbo 4 up is closer). UCSD
Pascal fails in console i/o, but other operations are implemented.
PascalP functions exactly as described below.
REMEMBER - THIS IS ANSI/ISO STANDARD PASCAL. The standard has
been published and in effect since 1983. The standard does NOT
forbid extensions, but does insist on compliance. In addition a
system meeting the standard MUST have a way of disabling any and
all extensions, so that source can be checked for portability.
Any Pascal file is conceptually a single stream, with a file
buffer variable. If we always refer to the file variable itself
as "f", the buffer variable is "f^". If f is declared as "f :
FILE OF thing", then f^ is of type "thing", and may be used as
such a variable at any time the file is open (i.e. after the file
has been reset or rewritten).
A Pascal text file is equivalent to "PACKED FILE OF char", and
additionally specifies that the eoln, readln, writeln procedures
may be used. THESE MAY NOT BE USED ON A NON-TEXT FILE.
Reading
-------
For reading, a file at any time consists of two ordered arrays of
items. The first is the portion that has already been input, and
the second is the portion that has not been input yet. The
buffer variable f^ always contains the last single item input
(consisting of characters, an eof mark, and an eoln mark for text
files). The eoln mark always appears as a space in f^, and may
only be detected by the eoln procedure. The eof mark in any non-
empty text file must immediately follow an eoln mark (specified
by the standard). (Thus any good system will automatically
append an eoln on closing a file, if and only if it is not
already present.) The second portion of the file is unlimited,
and unknown as yet to the Pascal program.
When a file is "reset" the file is actually opened, and the first
char is placed in f^ (this may be the eof or eoln mark, checked
by eof/eoln functions). This first char is removed from the
second portion.
If reset is applied to an already open file, the effect is always
to rewind it (if possible). Obviously some files cannot be reset,
e.g. it is hard to make the console repeat its previous sequence,
19
and thus "reset(input)" may be an error on some systems.
From here on, the action of the "get(f)" procedure is to advance
one further character in the source file, discarding the old f^
value, and replacing it with the next char. It should always be
an error to do this when eof is true.
Note that nothing has yet affected any variable in the Pascal
program, except the f^ buffer. These are the underlying
functions of the input system. The program may use the file by
such actions as "ch := f^" at any time.
The syntax of "read(f, ch)" is STRICTLY defined as "ch := f^;
get(f)", and the eoln and eof functions examine the non-visible
characteristics of the last input character. If "f" is omitted,
as in "read(ch)" the standard file "input" is assumed, and the
buffer variable is "input^".
For most CPM or MSDOS systems the file actually contains a <cr>
to mark eoln, and a <^Z> to mark eof. The value of f^ when eof
is true is not defined by the standards, but when eoln is true it
should be a space. Thus the <cr> character can not appear (unless
the system defines eoln as the <cr,lf> pair. Some systems always
discard any <lf>, so that the file action remains the same when
input from a keyboard as when input from a disk file.
The syntax of "read(f, ch1, ch2, ..)" is defined as "read(f,ch1);
read(f,ch2); .... ", and is simply a shorthand. If the object
read-into is an integer, or a real, then automatic conversion is
performed from a text string, and at completion f^ holds the
terminating character (space, non-numeric, etc). Such a read
causes a run-time error when no valid integer etc. is found
before a terminator, but leading blanks (and eolns) are skipped
over.
Notice that nothing so far controls any flushing of input lines,
to ensure that a read starts on the next physical line. This is
performed by "readln(f)", which is defined as "WHILE NOT eoln(f)
DO get(f); get(f)". NOTE the final get. This always leave f^
holding the first character of the next line (which is a space if
the next line is empty, i.e. consists of eoln alone), or possibly
an eof mark. Again, an omitted "f" implies input.
The syntax of "readln(f, item1, item2, .. itemn)" is defined as
"read(f,item1); read(f,item2); ... read(f,itemn); readln(f)", and
is again just a convenient shorthand.
Interactive files
-----------------
This brings up the great bugaboo of Pascal text i/o: When a file
is reset it MUST place the first character in f^. If that file is
20
interactive (i.e. the keyboard) the first character must be typed
at that time. Thus the natural sequence "reset(f); write('prompt
message'); read(f, ch)" to get a reply to a prompt requires that
the answer be typed before the prompt is made. The problem also
reappears after any readln, because the first "get" from the next
line is performed. (see below for why f^ is filled at all)
This is normally cured by a special driver for text files.
Whenever the "get" is executed it simply sets a flag somehere
(totally invisible to the application program) which says "a get
is pending". (If get finds the flag set it must perform the
pending get, and then again set the flag). Note that the "get"
may be implied by a reset, read, or readln operation. Now the
system must again intercept any use of eoln, eof, or the f^
variable and, before actually executing them, check the
"get_pending" flag. If set the actual get must be performed, the
flag reset, and then the eoln, eof, f^ references may be made.
This prevents the early physical read, and allows natural
programming. However the programmer should always remember that
any reference to eof, eoln, or f^ will cause the physical read.
Thus the sequence "reset(f); IF eof(f) THEN something;
write('prompt'); read(f,ch)" will cause the physical read to be
too early.
Some systems (e.g. UCSD) do not follow the ANSI/ISO standard, and
define a special interactive file type where read(f, ch) is
defined as "get(f); ch := f^". This causes all sorts of problems,
because the programmer must always know that this file is
interactive, and programs cannot use the standard input and
disk files interchangably.
The "get" is normally executed on reset (or readln) so that the
value of eoln and eof is available after using a character (by
read), and so that the program can look ahead to the next
character. This allows decisions to be made, i.e. is the
following character numeric.. then read a number; or is it alpha
.. then read a char; or is it a special .. then read a user
command etc. Thus a file copy program such as:
WHILE NOT eof DO BEGIN
WHILE NOT eoln DO BEGIN
read(ch); write(ch); END;
readln; writeln; END;
works naturally. The read/write line can be replaced by
write(input^); get(input); END
or by some sort of filter such as
IF input^ <> ' ' THEN write(input^);
get(input); END;
21
to strip out all blanks ith the same action and no auxiliary variable.
Such a fragment can copy the standard input to standard output,
and works correctly with any i/o redirection applied.
NOTE that "reset(input)" is always automatically performed when a
program begins running, and similarly "rewrite(output)". Thus
such statements should normally not appear in a program.
Think of readln as a line-flushing procedure, but bear in mind
that "readln(item)" is always equivalent to "read(item); readln".
Writing
-------
Again, all files consist of 2 parts, that which has been written
(and is in the external file), and that which has not. The next
item to be written is ALWAYS in f^, and "put(f)" is defined to do
the writing, and leave f^ contents undefined (in practice, most
systems leave f^ unchanged).
Similar to read, "write(f, item)" is STRICTLY defined to be the
sequence "f^ := item; put(f)". For text files ONLY, writeln can
put an eoln mark in the external file. Most systems, including
PascalP, can in practice create an eoln mark by writing suitable
characters, BUT THIS IS NOT portable.
Write(f, item1, item2, .. itemn) is STRICTLY defined as being
"write(f,item1); write(f, item2); ... write(f, itemn)", and
"writeln(f, item)" is defined as "write(f, item); writeln(f)".
Both of these are again shorthand. The writeln procedure alone
(i.e. writeln(f) ) simply puts an eoln mark into the file being
written. If the "f" specification is omitted the write is
shipped to "output" file by default.
Again, the fundamental writing procedure is "put(f)", which
causes the content of f^ to be appended to the end of the file f.
"write(f, item) is STRICTLY defined as "f^ := item; put(f)", and
should be unable to create the eoln mark in a text file (reserved
for writeln). The action of "rewrite(f)" is to empty any old
version of f, and leave f^ undefined. f^ is also undefined after
any write operation. Thus doing nothing except "rewrite(f)" in a
program should leave f as an empty file, but existing.
All Pascal files should be automatically closed when the defining
program (or procedure for a local file) is exited. Some systems
provide a "close" procedure to force an early close for one
reason or another (e.g. to release a locked file to another user
in a multi-process environment). If a file was open for write
(via rewrite), and is later "reset", an automatic close is done.
These closings of a written file append the eof mark, and force
any system buffers to be flushed. Some systems are incomplete,
and actually require that a specific call to "close" be made.
22
This procedure is non-standard, and such programs will not be
portable.
Text file writes
----------------
The standard specifies the formatting abilities for various types
of objects. The use of fields for integers and reals is complied
with in general, however
write(f, astring : field)
is defined to RIGHT justify astring in field positions, AND TO
TRUNCATE astring (on the right) when it is longer than field. An
item of type astring is compatible with PACKED ARRAY[1..] OF char.
Failure to do this truncation is a major impediment to portablity
in Turbo Pascal.
To harp on the action of writeln, this marks the end of a text
line and usually flushes any buffers. The actual implementation
of the file system should not matter, provided it meets the
abstract requirements of Pascal. Thus systems that never put an
eoln marker on the disk, but simply retain pointers to the line
endings, should function correctly.
Fixes
-----
Again, this is how it should work according to international (and
ANSI) standards. Some systems do not meet the standards - beware.
The so-called string type in Turbo and UCSD Pascal is not really
necessary, although it is often a convenience. It is possible to
create systems that are fully Standard compatible, and still have
string reads, concatenation, dynamic length, extraction, etc. All
such operations are available in PascalP as extended system pro-
cedures, and thus porting to another system only requires writing
such procedures. When not provided as part of the system package
efficiency may suffer, but no logical change in the program will
be needed.
For this reason I suggest that Turbo users should ALWAYS use the
concat function for strings, rather than the "+" operator. When
this is done the program can be converted fairly easily to an ISO
standard system.
For Turbo Pascal users, I have written a set of includable
procedures or units (see TURBOFIX.LBR or TURBOFIX.ARC for TP3,
TXTFInnn.LZH for TP4 up, with nnn 121 at present) which make
Turbo meet these standards, although you will have to use
non-standard procedure names. These are available without charge
23
for non-commercial use.
I hope this clears up some confusion.
24
All Sorts of Sorts
By Pete Davis
Every programming student comes across sorting at one time or
another. Occasionally, one comes across sorting in their own work.
Picking the right sort for the job isn't always an easy process. Many
things need to be taken into consideration. First of all, what will
the typical size for the data be? Some sorts work better for a lot of
data and others work better with small amounts. Other factors include
knowing what kind of data-structure you're working with. If you're
using a linked list, certain sorts might not be worth the trouble, but
if you're using an array, it might be ok to use those sorts.
I want to offer a range of sorts here. Each one will include a
short algorithm and some code examples will be included with this
issue of PNL. With each sort, I'll give a description of how it works,
what it's good points are and what it's problems are. I'll explain
what kind of applications it is best for and when to avoid it. There
are many programming books that go into a lot of depth on sorting, so
if you really want to get into the meat of sorting, I suggest you
check out your public library or local bookstore.
There are several ways to measure the quality of a sorts. Average
number of comparisons and average number of exchanges are very common
to get an idea of how much work the sort has to do for a particular
amount of data. i.e. average number of comparisons for 100 items in a
bubble sort is about 5000, depending on exactly how the algorithm is
implemented. Other important factors are the actual time it took to
execute the sort and memory requirments for the sort.
A. Bubble Sort
--------------
Talk about basic, the bubble sort is about the most basic and
easy to understand of the sorts. It's simplicity gives way to slow
execution and many comparisons. For small amounts of data (i.e. 100 or
fewer data items) the bubble sort will usually suffice. The exact
amount of data really depends on the size of the data-items and the
algorithm used. The only time where the bubble sort is the best sort,
is if the data is already in order. Otherwise, chances are, most other
sorts will beat the bubble sort, especially if the lowest value is
near the end of the list or the highest value is near the beginning of
the list.
The bubble sort uses the simple concept of going through a list
and checking each item against the item next to it. If they are out of
order, then put them in order. It keeps repeating this process until
it reaches the end of the list. Then it starts over at the beginning
of the list again, and keeps repeating this whole process until no
more items are switched. You can either check to see if no more items
are switched or you can go through N-1 times where N is the number of
items in the list. The prefered method is to check for whether or not
anything is switched. This means that if the list is already in order,
25
you won't have redundant passes through the list. The following will
show a group of characters out of order. I will then show what would
happen with each pass of the bubble sort.
START -> F E R M T A P
STEPS: { E&F SWITCH : F&R IN ORDER : R&M SWITCH
R&T IN ORDER : T&A SWITCH : P&T SWITCH }
END OF PASS 1 -> E F M R A P T { # exchanges: 4 }
END OF PASS 2 -> E F M A P R T { # exchanges: 2 }
END OF PASS 3 -> E F A M P R T { # exchanges: 1 }
END OF PASS 4 -> E A F M P R T { # exchanges: 1 }
END OF PASS 5 -> A E F M P R T { # exchanges: 1 }
I only showed all of the steps for the first pass. I believe it
is straight-foward enough that it was all that was necessary. You can
examine the code for more examples. Notice how much work it took to
sort only 7 items, though. there were 5 passes, and each pass
consisted of 6 comparisons and at most, 4 exchanges and at best, 1
exchange with a total of 9 exchanges and 30 comparisons. This isn't a
worse case scenario, but it is what I will compare several of the
sorts against. After that, we'll take some actual results from our
test program.
B. Selection Sort
-----------------
The selection sort is kind of like putting a hand of playing
cards in order. First thing you do is look at your whole list. You
take the lowest value and put it in the beginning. Then you take the
next lowest value and put it right after the first. (Or you could take
the highest and then the next highest and do it backwards, it really
doesn't matter.)
I don't think any sort of diagrams or algorithms need to be
explained for this. I think everyone is familiar with this technique.
Although the selection sort isn't incredibly fast, it is very simple
and does give a marginal improvement over the bubble sort, in general.
C. Insertion Sort
-----------------
The insertion sort can also use the idea of a hand of playing
cards, but instead of working with all of your cards at once, you take
a card one at a time and insert it where it belong as you get it. The
insertion sort is fairly simple, but has some problems. The insertion
sort is ideal for linked lists. If an insetion needs to be made, all
you have to do is re-direct some pointers. The insertion sort is much
less powerful when used with arrays, or contiguous lists. The problem
is that when you insert a new value, chances are you'll average moving
about half of the data in the list over one space. When your list gets
long, that can cause major problems. Each time something is inserted,
everything in the list would have to shift over.
26
The insertion sort, like the selection sort is very straight-
foward and simple, so I'll just let you examine the code if you have
any more questions.
D. Shell Sort
-------------
The shell sort uses the idea of sorting over an increment. Let's
use a new list of data: F E R M T A P G N Y C
and we'll take an increment of three. In our first step, we will
compare item #1 with item #4 (1+increment of 3) then 2 and 5 and make
exchanges over the increment. Let's see the results after 1 pass:
F E A M G N P C R Y T
We continue sorting with the increment of 3 until everything for
that increment is in order. So, for the next pass we get:
F E A M C N P G R Y T
and the third step:
F C A M E N P G R Y T
now, everything is in order over the increment of 3, so we drop
our increment to 2:
A C E M F G P N R Y T
and one more pass:
A C E G F M P N R Y T
now that that's in order, we drop to the final level with an
increment of 1. At this point, a single pass will put everything in
order:
A C E F G M N P R T Y
For larger amounts of data the results are a bit more dramatic.
The choice of a starting increment of 3 and a decrement of 1 for each
pass was completely ambiguous on my part. You can use any starting
increment and any decrement per pass that you want. The final pass
must be an increment of 1, however. With larger amounts of data, a
starting increment of 5 or higher will be more useful, and keeping the
space as an odd number for each pass is more efficient.
The shell sort is a bit more complicated than the sorts shown
previously but the average gain in speed is tens of times better than
any of the previous sorts.
27
E. Merge Sort
-------------
Although the concept behind the Merge Sort is fairly simple, the
actual coding of the sort is a bit more complex than one would expect.
In simple terms, the Merge Sort breaks your data in half and then
sorts each half. After the halves are sorted it puts them back
together. It's a bit more complex in that the routine calls itself
again and breaks each of the half-piles into halves, and so on and so
on until each pile has either 1 or no items.
You might be wondering why this is so quick, but think of it this
way, the Bubble Sort is called a quadratic sort. That means that if
you double the data to be sorted, it takes more than double the amount
of time to execute the sort. So, it would be quicker to break a big
set of data into two sets of data and sort them. The only extra time
you'd need is to put them back together in order. The results are a
bit more dramatic than that, though, as the piles keep getting broken
down more and more making each individual sort incredibly quick.
The Merge Sort works well with either contiguous data (arrays) or
linked lists. It's a great general purpose sort, but for small amounts
(less than 100 items) of data, it probably isn't worth the effort. The
only real drawbacks to the Merge Sort is that it requires double the
amount of space for data to make copies of the data. Another is that,
because it's recursive, it incurs a bit of overhead on speed and
space. This, however, isn't incredibly significant, however, when
comparing speeds with the previous sorts.
F. Quick Sort
-------------
The Quick Sort is generally recognized as the best general-
purpose sorting algorithm. It is much like Merge Sort in that the data
is recursively broken into two pieces. The first step in the Quick
Sort is to come up with a mid-value for your data. With numbers this
is pretty easy but strings and characters might require a bit more
effort. The way you come up with the mid-value can vary. The method
I'm going to use in the sample provided is to take the mean of the
first and last value in the list. It doesn't have to be the exact
middle, but the more accurate it is, the faster the sort. If you have
some sort of pre-knowledge of what your data will be like, you might
be able to come up with a more accurate method to finding the mid-
point.
The next step is to move all values lower than the mid-point to
the left (or lower end of your list) of the mid-point and all the
higher values to the right. Then you keep recursively splitting the
left and right in half, moving the values to the left and right side
of the mid-point, appropriately. You keep doing this until you are
down to 1 or 0 elements in the left and right. At this point, your
data is in order.
28
Summary
-------
The main point of this wasn't to go completely in-depth
explaining the subtelties of each individual sort, but more to give an
overview of each sort. I wanted the readers to be able to get a basic
idea of how each sort works, find its advantages and dis-advantages,
and be able to decide which sort will work best for your particular
application.
I'll leave most of the more technical explanations in the inline
documentation of the sort unit I am providing. The test program gives
you some stats on each sort that I have explained above.
These routines that I am providing are not necessarily the
fastest or most efficient algorithms. Everyone has their own minor
variations on certain sorts and you may want to change them to suit
your purposes.
29
For Beginners
By Bob Gowans
In the last issue of PNL (no.2) the beginner's column discussed a
simple but complete Pascal program in which the use of the Pascal
standard procedures writeln and write were highlighted. The rules
governing identifiers were also put to you. Continuing on this theme I
would like to show how you can build on this basic layout, and produce
a more substantial program.
Who has not heard of data? In this day and age it seems that
there is an overwhelming amount of data available on all sorts of
subjects. Our computer systems manipulate data - they take data in,
process it and churn it out and there you have it, your electricity
bill! As programmers we are mainly concerned with the manipulation of
data, we write programs that can do all sorts of things with data and
that produce data as results. But what exactly is data and how can we,
as prospective programmers, make use of such data that is suitable for
computers. Data comes in many forms, the actual word data means
'things given', an item of data represents some fact, observation or
idea. We humans can recognize all sorts of data easily - text,
numbers, pictures and speech to mention but a few. However, the poor
computer can only, in most cases, accept data in the form of symbols
such as characters and numbers, they are picking up mind you and there
are devices that enable the input of sound and pictures to computers.
Even with this restricted set of characters and numbers a huge range
of data is available as input to the computer. For example, two
characters can represent true and false, digits allow the computer to
use integer numbers, a decimal point with digits means the computer
can use real numbers,characters will allow us humans to input
languages ( such as Pascal ) into the computer. Once the data is in
the computer it can be processed and this is done according to sets of
instructions or algorithms. This is where we come in, we want to be
able to write algorithms that can manipulate the data that is
available.
'What data is available?', might be a question you would be
justified in asking. Well, Pascal has several different inbuilt types
of data and also there is a facility for creating your own data types
so you have plenty to go by. In Pascal the values True and False make
up the BOOLEAN data type, whole numbers constitute the type INTEGER,
decimal numbers are represented by REAL data types and characters are
represented by the CHAR type. If this seems a little fuzzy to you it
will soon become clear once you learn to implement the various data
types in a Pascal program. Lets do that, sticking to the
implementation of the integer type for now.
Briefly, the store manager wants you to write a program that will
calculate the total number of customers that use his store each day
and output to the screen the total number of customers and the sub
totals of female and male customers. You are given the respective male
and female customer subtotals at the end of each day. In the following
30
program note how the lines are indented, if you copy this layout your
programs will be more readable, but more about that later. Also note
that we have introduced a new Pascal reserved word VAR, it is after
this statement that variables are declared ie. the data types of the
variables are declared. Do not worry about variables for the time
being as we will again deal with these later. However, note that we
can assign values to variables using the symbol := this is a very
important and fundamental feature of Pascal.
program add_up; { Program heading }
var
female : integer;
male : integer;
total : integer;
begin
female := 250;
male := 101;
writeln('The total number of customers today was ',total);
writeln('The number of male customers was ',male);
writeln('The number of female customers was ',female)
end.
In the program we have three variables which are declared to be
of type integer, that is to say that the values we are going to give
to these variables must be of an integer data type. Variables are
declared following the Pascal reserved word VAR and are given a data
type as follows
variable : datatype;
We could have written the variable declarations in our program as
follows
female,male,total : integer
Each variable is separated by a comma then comes a colon before
the data type. As the variables are of type integer, the computer now
knows what type of data to store in them and will reserve space at a
memory location for three integer data types, and this is crucial - it
cannot store any other type of data in the locations it has reserved
for our three variables. If, for example, we tried to give female a
value of 1.5 then the computer would go haywire ( at least the
computer knows there is no such a thing as half a female ). Seriously
what would happen is that the computer would issue an error message
and the program would probably abort, which would not be to good if
you had just spent the last half hour entering data. At this point it
would be as well to define the data type integer.
INTEGER - the positive and negative whole numbers.
31
Incidentally the range of integers that you will be able to use
depends on your computer/compiler, the range for Turbo Pascal using a
640K RAM IBM Compatible is 32767 to -32768 for those who are
interested.
The situation with our program is that we have declared three
variables of type integer and now we are going to write some
statements in our program that can play around with these variables -
make them do some work for us! We can think of a variable as a memory
location, with an attached name or identifier which the computer uses
to access the contents of the variable. The rules for naming variables
are the same that apply to naming identifiers which were discussed in
the last issue of PNL. Why are they called variables? because as the
name variable infers they can be updated or given new values. If we
look at the program add_up we can see that female has been given a
value of 250, male a value of 101 and total a value of the sum of male
and female but these values are not fixed values as the program is
updated every day so the values of male, female and total will change.
In the program we say we have assigned a value of 250 to female and
assigned a value of 101 to male. Finally we have assigned the value of
the sum of male and female to total and written out the results.
With each program that you write it is a good idea to plan a data
table. Choosing the right data structures for variables and deciding
which is the right data type to use is sometimes the programmers most
difficult task, a data table can assist you greatly in helping to make
things more clear. Lets write a data table for our program - in later
issues you will be made to realise that it is better to plan and
design the program first before writing the actual code.
Identifier ( variable) Brief description Data type
male holds current total of
male customers. integer
female holds current total of
female customers integer
total holds the current total
of male + female integer
You will have no doubt noted from the program that the value of a
variable can be output to the screen, or to a printer or a file for
that matter, and that this is done by using the writeln statement
typically as follows
writeln(variablename);
This will have the effect of outputting to the screen the current
value stored in the variable. The screen output from our program would
thus be:
32
The total number of customers was 351
The number of male customers was 101
The number of female customers was 250
At the same time as outputting our variables we also took the
opportunity to output a relevant textual comment. The string that we
want to output must be in quotes and separated from the variable name
by a comma (,) and the whole writeln statement is enclosed in
brackets. I wonder if you noticed that there is space inserted between
the last character in the textline and the final quotation mark in the
writeln statement, the space is placed there so that the output from
the variable is not written directly at the end of the text line, such
as : The number of female customers was205 - resulting in messy
screen output. A small point but well formatted text output is a must
for today's user friendly programs.
The assignment statement can do more than just give a single
value to a variable, the assignment statement in Pascal has the
following form :
variable := expression
When the computer executes such a statement, the expression on
the right-hand side of the assignment operator (:=) is evaluated and
the value is given to the variable on the left-hand side. If it helps
you try reading the assignment operator as 'becomes' ie variable
becomes expression.
In this article we have learnt how to develop our programming
skills by realising that the declaration and correct use of data types
is an essential an important part of program development and that if a
data type is declared the it can only be used for the purpose ie. data
for which it was intended. We also discovered how to make assignment
statements and how to perform certain operations on variables
including writing them out to the standard output ( screen ). In the
next article we will further expand on these basic concepts and
hopefully introduce to you how to read in values to variables, by user
input.
The answer to the exercise given in the last article of PNL is :
writeln(''How many times have I told you, don''t do that;she said'');
This shows you that if you want to output text in quotation marks
then you must double up on quotation marks in the writeln statement.
33
Self Test
---- ----
(a) The following sequence of statements appears in a pascal program.
All variables are declared to be of type integer.
apples := 10;
oranges := 12;
bananas := 6;
fruit := oranges + bananas - apples;
apples := fruit - apples;
Write out the final value of fruit, apples, oranges and bananas.
(b) The variable number is declared to be of type integer. Which of
the following Pascal statements are valid :
(1) number := 10;
(2) number := number + 1;
(3) 100 := number;
(4) number = 100;
(5) number := 23.5;
(6) number := 100 + 15 - 205;
(C) You are asked to write a program that counts the number of
automobiles that are sold from a showroom. What data type would you
choose to represent the total number of cars sold? How would you
implement this problem in Pascal?
Solutions
---------
(a) fruit = 28, apples = 18, oranges = 12, bananas = 6.
(b) (1) valid.
(2) valid.
(3) invalid ( you can only assign a value to a variable if they
are on the
left of the assignment operator).
(4) invalid ( the assignment operator is := ).
(5) invalid ( number is of type integer and you cannot assign to
it a real
value ).
(6) valid.
(c) integer as automobiles are best represented by whole numbers.
program car_add;
var
34
total,autos : integer;
begin
total := total + autos;
writeln('The total automobiles sold are ',total)
end.
Obviously this program is incomplete as we require some method of
inputting the automobile data into the variable autos. This would be
done by writing prompts to the screen so that the salesman could enter
information, the program would read in such information. We will cover
this in the next article.
For fun
-------
The following program is intended for user's of Turbo Pascal only
and makes use of the Turbo Pascal crt unit to produce sound on your
PC's internal speaker. The main program is incorporated in a loop.
Sound is used to get the attention or make the user aware of errors or
as a security measure to deter unwanted users operating the system. As
we become more proficient we will be able to put these advanced
features to better use in our programs.
program alarm;
uses crt;
var
i : integer;
begin
for i := 1 to 100 do { loop start }
begin
sound(1500);
delay(10);
sound(500);
delay(500);
nosound;
delay(1500)
end { loop finish }
end.
Bob Gowans
35
The Object of Objects
By Jim Reid
With version 5.5 of Turbo Pascal, Borland added one of the newest
features in modern programming languages--the concept of objects. The
good news is that object-oriented Pascal programming is a major step
forward for programmers. The bad news is that if you don't already
know something about object- oriented programming (OOP for short),
Borland's version 5.5 OOP Guide won't help you very much. The first
page of Chapter 1, for example, drops on you such household concepts
as "encapsulation", "inheritance", and "polymorphism." Ugh! I hope
this short article can help a bit by talking about objects more
plainly.
So what is an object anyway? An object is a group of Pascal
variables (of any type) and a group of functions and procedures that
operate on them. In short, it is a programming technique that lets
you create new conceptual data types (like CIRCLES) and new conceptual
operators (like DRAW or SHRINK) for them.
Here's a simple example. One way to define a window area on your
display screen is by specifying two coordinates--the upper left (X,Y)
position and the lower right (X,Y) position. So, you might think of a
WINDOW_AREA as an object with four variables. Here's how you would
start to define this as an object.
Type
Window_Area = Object
Upper_X ,
Upper_Y ,
Lower_X ,
Lower_Y : byte;
{...more to follow here later...}
End;
As you can see, an object looks a little like a RECORD TYPE.
But there's nothing strange or unfamiliar about how you define
the variables of an object. You use the same type of definitions
you would use in the VAR statement.
If this were all that objects had to offer, you'd wonder what all
the fuss was over them. But defining the data variables of an object
is only the first part of the job. Next, you define the procedures
and functions that can operate on these variables. Using our example,
suppose you wanted to have one procedure that sets the window
coordinates, another procedure that defines and clears the windowed
area on the screen, and a third procedure to disable it. Let's expand
the object definition a bit further to do this:
36
Type
Window_Area = Object
Upper_X ,
Upper_Y ,
Lower_X ,
Lower_Y : byte;
Procedure Init (byte X1, Y1, X2, Y2);
Procedure Enable;
Procedure Done;
End;
Procedure Window_Area.Init (byte X1, Y1, X2, Y2);
Begin
Upper_X := X1;
Upper_Y := Y1;
Lower_X := X2;
Lower_Y := Y2;
End;
Procedure Window_Area.Enable;
Begin
Window(Upper_X,Upper_Y,Lower_X,Lower_Y);
ClrScr;
End;
Procedure Window_Area.Done;
Begin
Upper_X := 1;
Upper_Y := 1;
Lower_X := 80;
Lower_Y := 25;
Window(1,1,80,25);
End;
Notice several things about the object's definition now. First, the
object TYPE defines not only the variables but also a set of
procedures. The PROCEDURE definitions look just the way you would
define them in the INTERFACE portion of a Turbo Pascal UNIT--that is,
you define the name and parameters of the procedure, but not the
statements in it.
Later in your program, below the object definition, you include the
body of each of these procedures. Notice that each of the procedure
names is prefixed with the name of the object type ("Window_Area" in
this case). The procedures (or functions) defined for an object
automatically have access to the variables of that object. Thus,
these procedures can refer to "Upper_X" without having to qualify
where that variable was defined.
Within the body of an object's procedures, things look the same as
usual. These procedures can be passed variables, or not. They can do
37
anything that any typical Pascal procedure can do. They are simply
linked to their object's variables by reference.
There's nothing magical about the names INIT and DONE, but Turbo
Pascal's programming convention suggests that you name the procedure
which initially defines an object's variables as INIT, and the
procedure which finally terminates the object as DONE. If you don't
like those names, you are free to use whatever names you prefer.
Now that you've defined the object TYPE and its procedures, to
actually define an instance of the object (like our Window_Area) you
simply define it as a VAR. For example:
Var
My_Window : Window_Area;
To use your defined object, you execute one of the object's defined
procedures or functions. Here's a short example:
My_Window.Init(20,5,60,20);
My_Window.Enable;
WriteLn('See. It worked!');
Delay(1000);
My_Window.Done;
This example would define the window as the area between (20,5) and
(60,20), enable the window and clear it, write a short message in the
window display area, pause briefly, then reset the window as the
entire screen.
Notice that you invoke an object's procedures just like any other
Pascal procedures, but you prefix the procedure name with the name of
a specific object variable ("My_Window" in this case).
This may seem a little odd at first, but the concept is fairly
simple. Think of an object as a RECORD variable that is tightly
linked to a set of procedures and functions. Think of invoking an
object's procedure as if you had simply called the procedure normally,
passing it all the defined parameters plus the object's record
variable. If you think of it in that way, you can see that this much
of Turbo Pascal's object oriented programming extension is simply a
shorthand method for passing record variables to procedures.
There is much, much more to objects than this, however. Objects can
be constructed out of the variables and procedures of other objects.
For example, if you were writing a graphics program, you might define
a POINT as an object. The point has a single (X,Y) coordinate. You
might define a CIRCLE object as a POINT plus a radius. And you might
define a SPHERE object as a CIRCLE plus a 3-Dimension boolean flag.
The advantage which OOP offers you is that each object can be built on
the foundation laid by the predecessor objects. That is, higher level
objects inherit the data variables and procedures of lower level
38
objects on which they are defined. Objects also let you define to the
procedure at execution time which objects they will act on. (This
feature is called dynamic methods and polymorphism.) I'll try to
write more on these concepts in a later article.
The bottom line is that object-oriented programming can be used to
make your programs more flexible and easier to test. By
"encapsulating" the variables of an object and the procedures that
operate on them, you simplify the debugging process. By building
layers of objects, you simplify the coding process by reusing
functions and procedures of predecessor objects. Eventually, after
enough use of objects, this technique will change the way you
structure and design your programs. You will learn to think first
about the structure of your data, and second about the functions you
perform on the data.
If you'd like to see a longer example of using objects, I have
included TP unit named TEXTLIST.ZIP. This unit can be used to build
and manage a linked list of textual values. You might use this to
build a simple editor, for example. Enjoy, and start becoming
familiar with Turbo Pascal's object-oriented programming!
39
Doing More With Strings
By Gerhard Hoogterp
Fidonet 2:283/1.2
BitNet HOOGTERP@HENUT5
One of the things most programmers who switch from Basic to
Pascal notice first is the limited number of procedures and functions
Turbo Pascal has available to work with strings. Standard Pascal, not
having strings at all is even worse. Actually it's been a long time
since I programmed in Basic, but the need for some more advanced
string routines was there when I was working on a text adventure. My
string toolbox grew and became more advanced over the years, and so it
evolved into what I present here.
For my purposes, I defined a string as a list of tokens separated
by a delimiter. A sentence like:
This is a sentence, with seven words.
^ ^ ^ ^^ ^ ^ ^^end of string
Contains seven tokens and 4 delimiters. At some places the
delimiters are double, like the ', ' combination. The fourth delimiter
is the end of string Which, although not explicitly used by the
routines, is there anyhow.
As the toolbox was intended for use with a natural language
parser I couldn't put restrictions on the delimiters, or the way they
should be used. So the first routine to write was
Procedure SimplifyDelimiters( Var TokenList : String;
Delimiters : String);
It takes a string and changes double delimiters into single
delimiters. So the ', ' combination would be replaced with only
space.
To use this function with the above example would be:
Sentence:='This is a sentence, with seven words.';
SimplifyDelimiters(Sentence,' ,.');
40
The second function needed was a way to separate the string into
a head and a tail. To find the first token. This token was of course
delimited by a delimiter again.
Function FindToken(Var TokenList : String;
Delimiters : String):String;
When this function is used on the example sentence like
Sentence:='This is a sentence, with seven words.'
Token:=FindToken(Sentence,' ,.');
it will return the string 'This'. The delimiter is removed and the
sentence becomes 'is a sentence, with seven words.'
These two routines provide the user with a lot of power when, for
example, reading configuration files. SimplifyDelimiters accepts a
string in a user readable way and turns it into something more
suitable for a computer, the FindToken takes the first token (the
keyword in a CFG file) which can be converted to a number or a
subrange type to use in a CASE Statement.
To make life a little easier there are some functions in the toolbox
that alter the complete string too. Of course there are the
UpStr(Var TokenList : String) and
DownStr(Var TokenList : String)
functions to turn a string in all uppercase/lowercase, a
SkipLeadingSpaces(Var TokenList : String)
Which deletes the leading spaces (Spaces are almost always used as
delimiters, so leaving them there would, even after a
SimplifyDelimiters, always let the string start with an empty token)
DeleteTrailingSpaces(Var TokenList : String)
Which deletes the trailing spaces of a string for the same reason
as the SkipLeadingSpaces function. That leaves only two procedures in
the toolbox, which are useful in the original set up of the unit.
DeleteNoise(Var TokenList : String; Noise : String);
Deletes all the characters in the NOISE string from the
TokenList. When the user is free to type what he likes (as in an
41
adventure game) there are always characters which shouldn't be there.
DeleteNoise deletes those characters. It's even more useful when used
together with the other left procedure
ReplaceToken( Var TokenList : String;
Tok1 : String;
Tok2 : String);
which is used to change a certain token to an other token or
character.. For example, in a typical adventure sentence one could
type:
'Take the blue sword and go to the north'
In this sentence the token 'AND' is a delimiter. It separated two
commands. So it would be useful to change the 'AND' in for example '&'
which can be used by the FindToken to break the sentence in two
pieces.
Sentence:='TAKE THE BLUE SWORD AND GO TO THE NORTH';
ReplaceToken(Sentence,
'AND',
'&');
would do the trick. There are also some words in the sentence that
don't add any information to the command. Their only used because it's
english. The words 'THE' and 'TO' can be removed from the sentence. So
with
Sentence:='Take the blue sword and go to the north';
ReplaceToken(Sentence,'THE',' ');
ReplaceToken(Sentence,'TO',' ');
ReplaceToken(Sentence,'AND','&');
SimplifyDelimiters(Sentence,' ');
Results in: Sentence = 'Take blue sword & go north'
which is more suitable for a computer than the original sentence. An
other approach is to change all the "noise" words to a noise character
and delete that character with the delete noise procedure.
42
Bits & Pieces
Bits & Pieces is going to be a sort of gathering ground for small
routines and things that readers or maybe even me will contribute to.
It's basically quick and dirty routines that can't really make up a
whole article. Individual authors and programmers will be given credit
with their routines. This issues Bits & Pieces is going to be a bit
scarce, having only one item, but I think it will give readers an idea
of the kinds of things I'm looking for in it and maybe get them to
contribute to it.
Bits & Pieces #1 : Nathan S. Phillips
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Here is a procedure to display a number with commas (12345
becomes 12,345). I wrote this procedure some time ago for my "DSIZ -
Directory Size Utility" program. The program displays the size of a
directory (among many other things), and a display like "C:\FOO -
13245336 bytes" would be a little confusing - is that 1, 13, or 132
megs? But display the sizes with commas, such as "C:\FOO - 13,245,336
bytes", and the user can easily see how big the directory is. But
enough history - on to how the procedure does what it does. The
procedure takes the number you want to display with commas in a value
parameter of type real. It uses the real type so that large values
may be passed - only the integer portion of the number is used.
Basically, it takes a number and either:
1. Displays the number if it doesn't need
any commas (0-999)
2. Calls itself, passing the number divided
by 1000, if the number needs a comma
(greater than 999)
When the procedure calls itself, the remaining instructions are
"stacked", and executed when the call is complete. Therefore, the
procedure keeps calling itself (without displaying anything) until it
gets to the leftmost digit(s) that are displayed before the first
comma. It displays that portion, and returns. Then the stacked
instructions start executing, printing the commas and zero-padded
numbers. When all of the stacked instructions have been executed, the
entire number will have been printed.
Like many recursive procedures, it is an elegant, compact solution to
the problem. And, like many recursive procedures, it is difficult to
follow, especially if you don't use recursion often. (I should say
difficult for Pascal programmers to follow; Lisp programmers *think*
recursively.)
43
procedure DisplayWithCommas(r : real); { displays 1024 as 1,024 }
var t : real;
begin { NON-RECURSIVE CASE if r is less than 1000 }
if r < 1000 then { if this part is under a thousand, it doesn't }
write(r:1:0) { need zero padding or commas, so just print it }
{ (This would be the 1 in 1,024) }
else
begin { RECURSIVE CASE - will need at least one comma }
{ call this procedure again }
DisplayWithCommas(int(r/1000)); { with the same number, but }
{ without the last 3 digits }
write(','); { display the comma }
t := r - int(r/1000)*1000;{get the value of the last 3 digits}
{ (will be 24 for 1024) }
if t < 100 then write('0');{pad with 0s so the routine will}
if t < 10 then write('0'); { print 1,024 rather than 1,24 }
write(t:1:0); {now just write the value of the last 3 digits}
end;
end; { procedure DisplayWithCommas }
44
Conclusion
I like to do things in order. I wrote the introduction first
thing, and I'm finishing off with the Conclusion, after everything
else has been put togather. At this point, I'm speechless. I would
like to thank all of the wonderful people who contributed to this
issue and previous issues. We had a lot of article submitted this
month. If people can keep sending them in, the Pascal NewsLetter might
really gain some momentum.
I hope you've enjoyed this issue as much as I have. I've enjoyed
reading all of the articles and I've enjoyed putting them together for
you. I've been blown away by all the contibuted articles. It's really
been exciting to see the response. I am constantly getting
suggestions, comments, and good criticism for the newsletter. I'm glad
there are so many people out there who enjoy it. It makes it worth my
time.
I'm not really sure what I'm going to do for the fourth issue. I
do have some ideas written down here and there, but nothing firm yet.
I'm sure users will keep sending in the articles which will keep us
alive. If I had to rely on myself for each issue, I can assure you
they would only come out once every two months, and wouldn't be nearly
as good. I think I'm going to be able to put new issues out about once
every 3 or 4 weeks, which to me seems reasonable. If you'd like to see
it come out more often, then pitch in and send in some articles.
I'm still open to contributed articles. Please send them in. I
ask only one thing more: Please do not send the documents in
formatted. It is almost impossible to work with the documents which
are justified, double-spaced, etc... Please send them in single-
spaced and hard returns only at paragraph breaks. I'll take care of
the rest once it's integrated into the newsletter.
Again, I would like to thank everyone, and please keep the
comments and suggestions coming, but more importantly, keep the
article submissions comming!!!
_Pete Davis
Editor
45
The Pascal NewsLetter is Copyrighted by Pete Davis.
Articles submitted by others are the property of those
authors and are used with permission. They may not be
treated seperately from this newsletter without the
author's permission and thus maintain all distribution
rules of the newsletter as a whole. It may be freely
distributed in un-modified form, but no charge
whatsoever may be incurred on the recipient. All code
is provided 'as-is' with no guarantees whatsoever.
The Pascal NewsLetter can be obtained from the
following locations:
GEnie : General Electric Network for Information
Exchange. It is located in the IBMPC filelist.
Simtel: Internet address: 26.2.0.74 or Simtel20.ARPA It
is located in the <MSDOS.PASCAL> directory.
Programmer's Forum : This is the home of the PNL. Each
issue can be found in the MAG
directory from the main area.
The number is on the title page.
If you would like to receive back issues of PNL
directly from me, send a diskette and $2.00 for
shipping. Don't forget to include your address.
Send your order to:
Peter Davis
4851 Brandywine St. NW
Washington, DC 20016
If you are a SysOp that will regularly carry PNL and
would like to have your bulletin board listed as such,
here, send me a message either by postal mail or at one
of the electronic addresses given on the title page,
with your bulletin board's name, phone number, and your
name.
46
Distribution List
The following is the phone numbers to bulletin boards known
to carry PNL. If you would like your bulletin board's name and
number added to or deleted from this list, please send me a
message at one of my many addresses. I can not guarantee whether
a listed board will have any particular issue, however.
The Programmer's Forum ................. Phone: (202) 966-3647
The Programmer's Forum is the home of the PNL.
The Bored .............................. Phone: (512) 303-0471
Classic City ........................... Phone: (404) 548-0726
Theive's World ......................... Phone: (713) 463-8053
Hippocampus ............................ Phone: (203) 484-4621
Rogers State College BBS ............... Phone: (918) 341-8982
The Foxtrot BBS ........................ Re-Locating
Turbo City BBS ......................... Phone: (209) 599-7435
Austin IEMUG/MIDI BBS .................. Phone: (512) 528-0626
Laser Publishers ....................... Phone: (918) 438-2749
Fargo RBBS-PC .......................... Phone: (701) 293-5973